HMC at SemEval-2016 Task 11: Identifying Complex Words Using Depth-limited Decision Trees
Abstract
We present two systems created for SemEval-2016's Task 11: Complex Word Identification. Our two systems, a regression tree and a decision tree, were trained with a word's unigram and lemma word counts, average age-of-acquisition, and a measure of concreteness. The systems ranked 5th and 6th, respectively, on the test set by G-score (the harmonic mean of accuracy and recall). With the regression tree's predictions earning a G-score of 0.766 and the decision tree's earning 0.765, the two systems scored within 1 percent of the best-performing system in the task.
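As an illustration of the ranking metric named in the abstract, the following is a minimal sketch of a G-score computation, taking the abstract's definition (harmonic mean of accuracy and recall) at face value. The function name `g_score` and the label encoding (1 for complex, 0 otherwise) are assumptions made for this example, not the task's official scorer.

```python
def g_score(gold, predicted):
    """Harmonic mean of accuracy and recall for binary labels.

    Assumption for this sketch: 1 marks a complex word, 0 a simple one.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one label is required")

    # Accuracy: fraction of all words labelled correctly.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    accuracy = correct / len(gold)

    # Recall: fraction of truly complex words that were flagged as complex.
    positives = sum(1 for g in gold if g == 1)
    true_positives = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    recall = true_positives / positives if positives else 0.0

    # Harmonic mean; defined as 0 when both components are 0.
    if accuracy + recall == 0:
        return 0.0
    return 2 * accuracy * recall / (accuracy + recall)


if __name__ == "__main__":
    gold = [1, 1, 0, 0]
    predicted = [1, 1, 1, 0]
    print(f"G-score: {g_score(gold, predicted):.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot score well by labelling every word complex (perfect recall, poor accuracy) or by labelling none (good accuracy on a skewed dataset, zero recall).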
Similar resources
JU_NLP at SemEval-2016 Task 11: Identifying Complex Words in a Sentence
The complex word identification task refers to the process of identifying difficult words in a sentence from the perspective of readers belonging to a specific target audience. This task has immense importance in the field of lexical simplification. Lexical simplification helps in improving the readability of texts consisting of challenging words. As a participant of the SemEval-2016: Task 11 s...
MAZA at SemEval-2016 Task 11: Detecting Lexical Complexity Using a Decision Stump Meta-Classifier
This paper describes team MAZA entries for the 2016 SemEval Task 11: Complex Word Identification (CWI). The task is a binary classification task in which systems are trained to predict whether a word in a sentence is considered to be complex or not. We developed our two systems for this task based on classifier stacking using decision stumps and decision trees. Our best system, using contextual...
SemEval 2016 Task 11: Complex Word Identification
We report the findings of the Complex Word Identification task of SemEval 2016. To create a dataset, we conduct a user study with 400 non-native English speakers, and find that complex words tend to be rarer, less ambiguous and shorter. A total of 42 systems were submitted from 21 distinct teams, and nine baselines were provided. The results highlight the effectiveness of Decision Trees and Ens...
Sensible at SemEval-2016 Task 11: Neural Nonsense Mangled in Ensemble Mess
This paper describes our submission to the Complex Word Identification (CWI) task in SemEval-2016. We test an experimental approach to blindly use neural nets to solve the CWI task that we know little/nothing about. By structuring the input as a series of sequences and the output as a binary that indicates 1 to denote complex words and 0 otherwise, we introduce a novel approach to complex word ...
LTG at SemEval-2016 Task 11: Complex Word Identification with Classifier Ensembles
We present the description of the LTG entry in the SemEval-2016 Complex Word Identification (CWI) task, which aimed to develop systems for identifying complex words in English sentences. Our entry focused on the use of contextual language model features and the application of ensemble classification methods. Both of our systems achieved good performance, ranking in 2nd and 3rd place overall in ...